{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using the SageMaker TensorFlow Serving Container\n",
    "\n",
    "The [SageMaker TensorFlow Serving Container](https://github.com/aws/sagemaker-tensorflow-serving-container) makes it easy to deploy trained TensorFlow models to a SageMaker Endpoint without the need for any custom model loading or inference code.\n",
    "\n",
    "In this example, we will show how deploy one or more pre-trained models from [TensorFlow Hub](https://www.tensorflow.org/hub/) to a SageMaker Endpoint using the [SageMaker Python SDK](https://github.com/aws/sagemaker-python-sdk), and then use the model(s) to perform inference requests."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "First, we need to ensure we have an up-to-date version of the SageMaker Python SDK, and install a few\n",
    "additional python packages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -U --quiet \"sagemaker>=1.14.2\"\n",
    "!pip install -U --quiet opencv-python tensorflow-hub"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we'll get the IAM execution role from our notebook environment, so that SageMaker can access resources in your AWS account later in the example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker import get_execution_role\n",
    "\n",
    "sagemaker_role = get_execution_role()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download and prepare a model from TensorFlow Hub\n",
    "\n",
    "The TensorFlow Serving Container works with any model stored in TensorFlow's [SavedModel format](https://www.tensorflow.org/guide/saved_model). This could be the output of your own training job or a model trained elsewhere. For this example, we will use a pre-trained version of the MobileNet V2 image classification model from [TensorFlow Hub](https://tfhub.dev/).\n",
    "\n",
    "The TensorFlow Hub models are pre-trained, but do not include a serving ``signature_def``, so we'll need to load the model into a TensorFlow session, define the input and output layers, and export it as a SavedModel. There is a helper function in this notebook's `sample_utils.py` module that will do that for us."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sample_utils\n",
    "\n",
    "model_name = 'mobilenet_v2_140_224'\n",
    "export_path = 'mobilenet'\n",
    "model_path = sample_utils.tfhub_to_savedmodel(model_name, export_path)\n",
    "\n",
    "print('SavedModel exported to {}'.format(model_path))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After exporting the model, we can inspect it using TensorFlow's ``saved_model_cli`` command. In the command output, you should see \n",
    "\n",
    "```\n",
    "MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:\n",
    "\n",
    "signature_def['serving_default']:\n",
    "...\n",
    "```\n",
    "\n",
    "The command output should also show details of the model inputs and outputs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!saved_model_cli show --all --dir {model_path}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Optional: add a second model\n",
    "\n",
    "The TensorFlow Serving container can host multiple models, if they are packaged in the same model archive file. Let's prepare a second version of the MobileNet model so we can demonstrate this. The `mobilenet_v2_035_224` model is a shallower version of MobileNetV2 that trades accuracy for smaller model size and faster computation, but has the same inputs and outputs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "second_model_name = 'mobilenet_v2_035_224'\n",
    "second_model_path = sample_utils.tfhub_to_savedmodel(second_model_name, export_path)\n",
    "\n",
    "print('SavedModel exported to {}'.format(second_model_path))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we need to create a model archive file containing the exported model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a model archive file\n",
    "\n",
    "SageMaker models need to be packaged in `.tar.gz` files. When your endpoint is provisioned, the files in the archive will be extracted and put in `/opt/ml/model/` on the endpoint. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tar -C \"$PWD\" -czf mobilenet.tar.gz mobilenet/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Upload the model archive file to S3\n",
    "\n",
    "We now have a suitable model archive ready in our notebook. We need to upload it to S3 before we can create a SageMaker Model that. We'll use the SageMaker Python SDK to handle the upload."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.session import Session\n",
    "\n",
    "model_data = Session().upload_data(path='mobilenet.tar.gz', key_prefix='model')\n",
    "print('model uploaded to: {}'.format(model_data))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a SageMaker Model and Endpoint\n",
    "\n",
    "Now that the model archive is in S3, we can create a Model and deploy it to an \n",
    "Endpoint with a few lines of python code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.tensorflow.serving import Model\n",
    "\n",
    "# Use an env argument to set the name of the default model.\n",
    "# This is optional, but recommended when you deploy multiple models\n",
    "# so that requests that don't include a model name are sent to a \n",
    "# predictable model.\n",
    "env = {'SAGEMAKER_TFS_DEFAULT_MODEL_NAME': 'mobilenet_v2_140_224'}\n",
    "\n",
    "model = Model(model_data=model_data, role=sagemaker_role, framework_version='1.13', env=env)\n",
    "predictor = model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Make predictions using the endpoint\n",
    "\n",
    "The endpoint is now up and running, and ready to handle inference requests. The `deploy` call above returned a `predictor` object. The `predict` method of this object handles sending requests to the endpoint. It also automatically handles JSON serialization of our input arguments, and JSON deserialization of the prediction results.\n",
    "\n",
    "We'll use these sample images:\n",
    "\n",
    "<img src=\"kitten.jpg\" align=\"left\" style=\"padding: 8px;\">\n",
    "<img src=\"bee.jpg\" style=\"padding: 8px;\">"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# read the image files into a tensor (numpy array)\n",
    "kitten_image = sample_utils.image_file_to_tensor('kitten.jpg')\n",
    "\n",
    "# get a prediction from the endpoint\n",
    "# the image input is automatically converted to a JSON request.\n",
    "# the JSON response from the endpoint is returned as a python dict\n",
    "result = predictor.predict(kitten_image)\n",
    "\n",
    "# show the raw result\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Add class labels and show formatted results\n",
    "\n",
    "The `sample_utils` module includes functions that can add Imagenet class labels to our results and print formatted output. Let's use them to get a better sense of how well our model worked on the input image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# add class labels to the predicted result\n",
    "sample_utils.add_imagenet_labels(result)\n",
    "\n",
    "# show the probabilities and labels for the top predictions\n",
    "sample_utils.print_probabilities_and_labels(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Optional: make predictions using the second model\n",
    "\n",
    "If you added the second model (`mobilenet_v2_035_224`) in the previous optional step, then you can also send prediction requests to that model. To do that, we'll need to create a new `predictor` object.\n",
    "\n",
    "Note: if you are using local mode (by changing the instance type to `local` or `local_gpu`), you'll need to create the new predictor this way instead:\n",
    "\n",
    "```\n",
    "predictor2 = Predictor(predictor.endpoint, model_name='mobilenet_v2_035_224', \n",
    "                       sagemaker_session=predictor.sagemaker_session)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.tensorflow.serving import Predictor\n",
    "\n",
    "# use values from the default predictor to set up the new one\n",
    "predictor2 = Predictor(predictor.endpoint, model_name='mobilenet_v2_035_224')\n",
    "\n",
    "# make a new prediction\n",
    "bee_image = sample_utils.image_file_to_tensor('bee.jpg')\n",
    "result = predictor2.predict(bee_image)\n",
    "\n",
    "# show the formatted result\n",
    "sample_utils.add_imagenet_labels(result)\n",
    "sample_utils.print_probabilities_and_labels(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Additional Information\n",
    "\n",
    "The TensorFlow Serving Container supports additional features not covered in this notebook, including support for:\n",
    "\n",
    "- TensorFlow Serving REST API requests, including classify and regress requests\n",
    "- CSV input\n",
    "- Other JSON formats\n",
    "\n",
    "For information on how to use these features, refer to the documentation in the \n",
    "[SageMaker Python SDK](https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst).\n",
    "\n",
    "## Cleaning up\n",
    "\n",
    "To avoid incurring charges to your AWS account for the resources used in this tutorial, you need to delete the SageMaker Endpoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictor.delete_endpoint()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_tensorflow_p36",
   "language": "python",
   "name": "conda_tensorflow_p36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
